These issues need to be taken into consideration during the design stage, since, ultimately, the decisions on how test clocks are designed can have a significant impact on automatic test pattern generation (ATPG). This article discusses the trade-offs among various clock test methodology scenarios and offers food for thought on how to proceed in each scenario.
First and foremost, clock trees are designed with functional operation in mind. Clock control logic may be designed to pulse various clocks in a specific sequence or to ensure that certain clocks are not pulsed at the same time. During functional operation, internal clock generators may also manage internal clock skew in order to ensure proper functional operation. Alternatively, resynchronization logic may be used to control data transfer between clock domains that operate truly asynchronously. During a scan-based manufacturing test, however, the clocks might be operated in a different way, since in test mode the clocks are delivered by the automatic test equipment (ATE) and internal clock generators and controls are bypassed.
Maintain multiple asynchronous clocks in test mode. This is easier to implement but could result in larger test sets.
By using a single clock in test mode, the internal clock signals are bypassed to one external clock signal. From the tester and test generation point of view, the design has one clock. In general, there are both advantages and disadvantages to using this alternative.
Among the advantages of using a single clock in test mode are that only one pin needs to be used as a dedicated test-mode clock pin, that the method ensures the most compact test pattern set and that it provides the shortest ATPG run-times. Since the design has only one clock in test mode, the test generation effort and, therefore, run-times are minimized.
There are some disadvantages as well. The methodology requires very careful clock design and analysis. For mux-DFF designs, the clock tree needs to be synthesized separately for test mode. Even though Clk1 and Clk2 are correctly skewed in functional mode, they will now be clocked at the same time, and clock skew can cause problems during both shift and capture.
Also, scan chain reordering is more complicated since ordering must take into account the clock domains for proper shift operation. What's more, little flexibility is left in resolving potential problems. If clock skew occurs, it is difficult to resolve without modifying the design or losing test coverage.
The second approach uses multiple test clocks, with one dedicated pin per clock domain. In functional mode, multiple clocks are generated internally. In test mode, each internal clock is driven from its own clock pin. From the ATPG tool's point of view, the design has multiple clocks. All clocks are still pulsed at the same time during shift, but the ATPG tool is now free to handle the clock domains in different ways during the capture cycle.
There are clear advantages to using multiple clocks. First, it is the safest approach, since there is no separate "test mode only" clock tree. As long as there are no clock skew problems in functional mode, there should not be any in test mode.
Also, different ATPG methods can be used in generating the tests, and there is flexibility in pulsing each clock separately or at the same time.
Using multiple clocks allows peak power consumption to be reduced because not all clocks are active at the same time during the capture cycles. If the scan chains have lockup latches to prevent skew between domains, it is also possible to "stagger" the clocks during shift mode and, thereby, reduce peak power consumption during the shift operation.
Of course, there are disadvantages to this scheme, too. If all clocks were generated internally, then some additional routing and more pins would need to be dedicated as clock pins during test mode. Nonetheless, it is possible to share test clock pins with functional pins. Such sharing can reduce test coverage, because the pins need to be operated as clocks and not as regular input pins during test, but the coverage reduction is usually insignificant.
Multiple clocks also increase pattern count. Since each pattern will probably not use all of the clocks, the pattern count will usually be larger. Advanced techniques for ATPG that can limit the increase in patterns are discussed later. However, solutions that minimize the increase in pattern count will increase the ATPG run-times.
Standard scan design
The advantages and disadvantages discussed so far assume a standard mux-DFF style scan design.
Using an LSSD-based scan approach rather than mux-DFF solves many issues related to scan chain shifting. An LSSD scan cell requires three clocks: a system clock and two nonoverlapping clocks that are used during shift (Fig. 2). This LSSD cell is used to replace a nonscan latch and is suitable for latch-based designs.
Similar cells exist for DFF-based designs. Such LSSD cells have an edge-triggered system clock but use the same nonoverlapping clocks during shift as regular LSSD. Therefore, it is not necessary to have a latch-based design before scan in order to use an LSSD approach. This type of scan design eliminates the skew issues during shift, since two nonoverlapping clocks are used. However, it does not eliminate potential skew problems during capture.
Whether the design uses one clock or multiple clocks during test mode, designers must ensure that clock skew doesn't cause problems during shift. As discussed, this can be accomplished by using an LSSD scan methodology or through correct ordering of the scan chains and usage of lockup latches.
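The ordering and lockup-latch rule can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: the cell names, the `stitch_chain` helper and the `LOCKUP(...)` marker are hypothetical, and the sketch groups cells purely by clock domain, whereas a production flow would also weigh clock latency and placement.

```python
def stitch_chain(cells: list[tuple[str, str]]) -> list[str]:
    """Order scan cells so each clock domain is contiguous in the chain.

    `cells` is a list of (cell_name, clock_name) pairs. A lockup-latch
    marker is placed wherever the chain crosses from one domain to the next.
    """
    # Group cells by clock; dicts keep first-seen order (Python 3.7+).
    by_clock: dict[str, list[str]] = {}
    for name, clock in cells:
        by_clock.setdefault(clock, []).append(name)

    chain: list[str] = []
    prev_clock = None
    for clock, names in by_clock.items():
        if prev_clock is not None:
            # Domain boundary: mark where a lockup latch would be inserted.
            chain.append(f"LOCKUP({prev_clock}->{clock})")
        chain.extend(names)
        prev_clock = clock
    return chain


# Hypothetical four-cell chain spread over two clock domains.
cells = [("ff_a", "Clk1"), ("ff_b", "Clk2"), ("ff_c", "Clk1"), ("ff_d", "Clk2")]
chain = stitch_chain(cells)
print(chain)  # ['ff_a', 'ff_c', 'LOCKUP(Clk1->Clk2)', 'ff_b', 'ff_d']
```

With the cells grouped this way, the chain crosses a domain boundary only once per pair of adjacent domains, so only one lockup latch is needed at each crossing.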
If multiple clock domains are combined into one test clock, then careful attention also must be paid to the clock tree design to ensure that there will be no clock skew problems during the capture cycle. Designs with multiple test clocks can make use of several ATPG techniques to ensure that the final test patterns will work properly in manufacturing regardless of any skew issues between clock domains.
The traditional way to ensure that clock skew does not cause problems in the capture cycle is to pulse only one clock per pattern. During shift, all clocks are pulsed at the same time, but during capture only one clock is pulsed per pattern. In this example, each scan chain has two scan cells, and there is a total of four scan clocks.
This is the easiest way to avoid clock skew during capture. Little ATPG effort is needed, which results in short run-times. The trade-off is a high pattern count: For a design with 10 clocks, one can experience up to 10 times more patterns than if the design had one clock.
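The "up to 10 times" figure is a worst case, which the rough counting model below tries to make concrete. It is a sketch, not an ATPG algorithm: it assumes each single-clock pattern can be described by the set of domains that must capture, and that the one-clock-per-pattern approach needs one pattern per such domain. The function name and clock names are hypothetical.

```python
def one_clock_per_pattern(needs: list[set[str]]) -> list[tuple[int, str]]:
    """Split each single-clock pattern into one pattern per required clock.

    `needs[i]` is the set of clock domains that must capture for pattern i.
    Returns (original_pattern_index, capture_clock) pairs.
    """
    return [(idx, clk) for idx, clocks in enumerate(needs) for clk in sorted(clocks)]


all_clocks = {f"TClk{i}" for i in range(1, 11)}  # a design with 10 clocks

# Worst case: every pattern needs a capture in all 10 domains.
worst = one_clock_per_pattern([all_clocks, all_clocks])

# Milder case: patterns that touch only a few domains each.
milder = one_clock_per_pattern([{"TClk1", "TClk2"}, {"TClk3"}])

print(len(worst), len(milder))  # 20 3
```

In this model, two patterns grow to 20 only when every pattern exercises every domain. Patterns that touch few domains grow far less, which is why the factor of 10 is an upper bound.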
Another approach is first to route all clocks to separate inputs in test mode and then analyze which clock domains are independent (i.e., which ones have no functional paths between them).
If the ATPG tool is capable of performing the analysis, then it can treat independent clock domains as one. That will yield the same effect as using one pin for these clocks. In the example shown in Fig. 3, analysis shows that there are no functional paths between clock domains 1 and 3. Therefore, for Pattern 1, clocks TClk1 and TClk3 can be pulsed at the same time. Since there is interaction between other domains, clock TClk2 is pulsed alone for Pattern 2.
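The domain analysis can be pictured as a small graph problem. The sketch below is a minimal illustration, not a description of how any particular ATPG tool performs it. It assumes the functional paths between domains are already known as a list of clock pairs and uses a simple greedy grouping. The function name and the specific interactions listed for TClk2 are assumptions chosen to match the three-domain example.

```python
def group_compatible_clocks(
    clocks: list[str], paths: list[tuple[str, str]]
) -> list[list[str]]:
    """Greedily group clocks so no two clocks in a group share a functional path."""
    # Build a symmetric interaction map: clock -> clocks it exchanges data with.
    interacts: dict[str, set[str]] = {c: set() for c in clocks}
    for src, dst in paths:
        interacts[src].add(dst)
        interacts[dst].add(src)

    groups: list[list[str]] = []
    for clock in clocks:
        for group in groups:
            # The clock may join a group only if it interacts with no member.
            if interacts[clock].isdisjoint(group):
                group.append(clock)
                break
        else:
            # No compatible group found: this clock starts a new one.
            groups.append([clock])
    return groups


clocks = ["TClk1", "TClk2", "TClk3"]
# Assumed interactions: domain 2 talks to domains 1 and 3; 1 and 3 are independent.
paths = [("TClk1", "TClk2"), ("TClk2", "TClk3")]
groups = group_compatible_clocks(clocks, paths)
print(groups)  # [['TClk1', 'TClk3'], ['TClk2']]
```

Each resulting group corresponds to a set of clocks that can be pulsed together in one capture cycle. A greedy pass does not guarantee the smallest number of groups, but it keeps the sketch short.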
A low pattern count is generated, and if there is a problem between certain domains, one can use a different approach without changing the hardware. But this approach requires analysis to determine which clocks can be pulsed at the same time without causing problems. Some ATPG tools automate this analysis.
Pulsing clocks sequentially results in the most compact pattern set without the risk of clock skew causing problems in the capture cycle. The capture takes place over multiple cycles. Only one clock is pulsed per cycle. Since the same time plate is used for all cycles, no additional time plates or other settings are required. This method is used by the Mentor Graphics FastScan tool's multiclock compression.
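To show the shape of such a pattern, the sketch below lists the clock events for one test pattern: shift cycles that pulse every clock, followed by capture cycles that each pulse exactly one clock. It is an assumption-laden illustration. The `build_pattern` helper and the event tuples are hypothetical, and the sketch says nothing about how an ATPG tool chooses the capture order.

```python
def build_pattern(
    clocks: list[str], capture_order: list[str], chain_length: int
) -> list[tuple[str, tuple[str, ...]]]:
    """Return the clock events for one pattern with sequential capture clocks."""
    unknown = [clk for clk in capture_order if clk not in clocks]
    if unknown:
        raise ValueError(f"capture clocks not in design: {unknown}")

    # Shift: all clocks are pulsed together, once per scan cell in the chain.
    events = [("shift", tuple(clocks)) for _ in range(chain_length)]
    # Capture: several cycles, each pulsing exactly one clock.
    events += [("capture", (clk,)) for clk in capture_order]
    return events


# Four scan clocks, chains of two cells, and three capture cycles.
clocks = ["TClk1", "TClk2", "TClk3", "TClk4"]
events = build_pattern(clocks, ["TClk1", "TClk2", "TClk3"], chain_length=2)
for kind, pulsed in events:
    print(kind, pulsed)
```

Because every capture event carries a single clock, two domains are never clocked in the same capture cycle, which is what removes the skew risk.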
Expensive advantages
The advantages include a reduced pattern count (compared with pulsing one clock per pattern); very good compression results, with a sequential depth of only 2 or 3, even for designs with many clocks; and no risk of clock skew during capture. These advantages do come at the expense of an increased ATPG run-time.
Multiclock compression resolves the test set size dilemma that using multiple clocks in test mode usually creates while avoiding the issues of clock skew. One example of an SoC-type design shows how.
The design, provided by Tality, has a large number of clock domains (14), and the number of flops associated with each clock varies dramatically. One clock drives more than 5k flops, another drives 2k, and a couple drive approximately 1k each. The remaining clocks have between 20 and 200 flops each.
The accompanying graph (Fig. 4) plots the number of vectors produced against the CPU time for several ATPG runs. The "synchronous" curve represents a purely synchronous implementation, in which all clocks are switched to a single source. The different points on the curve represent increased effort levels during dynamic compression, which increases run-time but can reduce the number of vectors.
The "multiclock" curve represents an asynchronous implementation of the design in which clock sources remain separate to avoid any possible timing hazards. The first (left-most) point represents conventional ATPG with only a single capture cycle for each test vector. The successive points are for runs that allow an increasing number of capture clocks for each test pattern. This increases the CPU time but decreases the final vector count. The final pattern count is very close to what is possible with a single clock.
For mux-DFF-based designs with multiple clock domains, we recommend giving each internal clock domain its own clock pin in test mode. That will provide the most options for test pattern generation and eliminate the need to design a special clock tree just for test. To achieve the most compact pattern set, clock domain analysis should be performed to determine which domains do not interact, so that ATPG can pulse these clocks simultaneously when patterns are generated.
Finally, the remaining clocks should be pulsed sequentially using a multiclock compression technique. These methods, combined, will result in the safest approach to a minimized pattern set.
---
Richard Illman, chief consulting engineer at Tality Corp. (Livingston, Scotland), is a graduate of England's University of Hull. Greg Aldrich is product marketing manager at Mentor Graphics Corp. (Beaverton, Ore.). He holds a BEE degree from the University of Illinois.
http://www.isdmag.com
Copyright © 2002 CMP Media LLC
4/1/02, Issue # 14154, page 30.